Temporal Pattern Mining in Logistics

Authors

  • Andreas D. Lattner
  • Tjorben Bogon
  • René Schumann
  • Ingo J. Timm
Abstract

Modern technologies like RFID, GPS, and wireless networks provide means for the automated tracking of goods, containers, and transportation vehicles. Collecting data about positions and movements of different actors and objects is a prerequisite for an automated analysis of ongoing logistic processes. As extensive information about objects and their relations is available, it is possible to apply data mining techniques in order to identify patterns in the data. In this work, we analyze the requirements for mining patterns in the logistics domain and present an algorithm for mining temporal patterns and prediction rules from complex representations of scenes. An implementation of the mining approach is applied to two test scenarios in order to create and test prediction rules.

INTRODUCTION

In logistics scenarios, there exist many different kinds of objects, such as transport vehicles (e.g., trucks, ships, or planes), different actors or organizations (e.g., depots, carriers, manufacturers), highways or tracks, transshipment stations, etc. Different events can occur in the dynamic environment, such as traffic jams, weather events, breakdowns of transport vehicles, or delivery delays of some goods. It would be very valuable to identify repeating patterns that lead to certain situations in order to predict, for instance, that rescheduling is likely to become necessary later on in a given situation. With this information, it would be possible to initiate counter-actions earlier in order to avoid financial loss or penalty payments. Such a pattern could be, for instance: If the capacity load of depot X is high and a transportation vehicle Y on the way to X has a breakdown, it is likely that some goods Z have to be transferred to some other depot.

Modern technologies like RFID, GPS, and wireless networks provide means for the automated tracking of goods, containers, and transportation vehicles.
The availability of data about positions and movements of different objects enables an automated analysis of ongoing logistic processes. If extensive information about objects and their relations is available, it is possible to apply data mining techniques in order to identify patterns in the data. Patterns can be transformed into prediction rules by estimating causal interdependencies. If it were known that certain situations lead to some future events or situations with some probability, this information could be used to improve the decision making process in logistic scenarios, for instance by avoiding critical situations.

The logistics domain features high complexity w.r.t. the different objects and relations appearing in scenes on the one hand and demands means for the representation of temporal information on the other hand. The requirements for learning such patterns are manifold. On the one hand, the complex situations need a sophisticated representation formalism in order to capture the scenes and to represent the patterns. On the other hand, the patterns should be comprehensible and easily applicable to future situations. The requirements are:

• The scene representation has to provide means to represent objects, their properties, as well as interrelations among objects.
• Due to the dynamics, an explicit representation of the temporal dimension is needed. As different relations and events can exist concurrently, it is required that the representation can also deal with such concurrent situations.
• Capturing conceptual information about object classes and possible interrelations is needed to guide the search for patterns.
• Background knowledge is necessary in order to cover general rules of the domain without demanding the user to provide (redundant) knowledge about each individual object (truck, depot, etc.).
• Unsupervised identification of generic frequent patterns, i.e., abstracting from the concrete objects and identifying common patterns.
• Generation of prediction rules to be applied to future situations.
• Pattern matching methods in order to detect patterns (or preconditions of prediction rules).
• Evaluation of created prediction rules.

The paper is structured as follows: Section 2 presents related work, addressing learning in logistics and pattern mining approaches. Our pattern mining approach is described in Section 3. In Section 4, we give an example scenario and show created patterns before conclusions are drawn in Section 5.

RELATED WORK

Logistics scenarios usually involve huge amounts of data. Different learning approaches have been used in order to handle this data and to improve the solutions of logistics problems. For instance, neural networks are used to optimize traditional logistics problems like the Travelling Salesman Problem, routing problems, bankruptcy prediction, and dispatching problems (Wilppu, 1999). Another task in logistics is to manage and control a supply chain. With the rising connectivity of the world (airplanes, ships), supply chains spread widely over all continents. Bruzzone and Orsoni describe three different approaches to reducing costs in supply chains using techniques from the areas of artificial intelligence, stochastics, and mathematics (Bruzzone and Orsoni, 2003). Another approach to handling the complexity of global supply chains is described in (Pontrandolfo et al., 2002). Pontrandolfo et al. use Reinforcement Learning to let different sites learn to work together efficiently. Every site is represented by an agent that acts with semi-Markov decision processes.

Traffic route planning problems can also be described with a multiagent system. Gehrke and Wojtusiak present an approach to react online to influences from the environment (for example, weather) (Gehrke and Wojtusiak, 2008a, b). The approach tries to identify the best routes by taking into account the wetness and the speed limits of the roads.
Every truck is represented by an agent and can dynamically react to new events. The (propositional) rule induction system AQ21 has been used to set up prediction rules.

In recent years, different learning approaches have been presented which partially satisfy the requirements described above. In the field of Inductive Logic Programming (ILP), various approaches have been presented that can deal with relational representations. ILP approaches like FOIL and Progol (Muggleton, 1995; Quinlan, 1990) are supervised and thus do not identify frequent patterns from data as desired. The rule learner WARMR combines ideas from the fields of association rule mining and ILP (Dehaspe and Toivonen, 1999) but does not provide means for an explicit representation of the temporal dimension. Different approaches to sequential or temporal pattern mining can be found, e.g., (Agrawal and Srikant, 1995; Höppner, 2003; Mannila et al., 1997), but most of these approaches cannot deal with relational representations as required in our case.

Jacobs and Blockeel apply the ILP association rule learner WARMR in order to mine shell scripts from Unix command shell logs (Jacobs and Blockeel, 2001). Although the domain is completely different, the resulting patterns can represent complex interrelations and temporal relations (sequences). Log files of shells can be seen as a sequence of commands. Frequent patterns from such command sequences can be interpreted as shell scripts. The challenge is to deal with the arguments in commands and to use variables in patterns in order to represent that the same argument (e.g., a file name) should be used by a different command. Jacobs and Blockeel use WARMR for the generation of scripts and also present some methods for speedup by splitting up the learning task and using the so-called minimal occurrence algorithm.
Command sequences are represented by a stub relation with unique identifier, execution time, and command (e.g., stub(1,2008,'cp')) and parameter relations (e.g., parameter(1,1,'file1') and parameter(1,2,'file2')). The learning task is split up by first finding the frequent sequences of commands and then taking the parameter information into account. The minimal occurrence algorithm takes into account the sequential information in the query generation process and can thus prune some of the patterns that cannot be frequent any more. It also utilizes the identifiers at which sequences of the previous step start for the calculation of occurrences of a new sequence (Jacobs and Blockeel, 2001).

Masson and Jacquenet address the mining of frequent logical sequences (Masson and Jacquenet, 2003). They extend the SPIRIT system (Garofalakis et al., 1999) to discover logical sequences and introduce the SPIRIT-LOG algorithms. The major adaptations w.r.t. SPIRIT are made in the generation and pruning functions. Candidate generation is extended to handle logical sequences with variables, and for the pruning step an inclusion test between logical sequences (including unification of variables) is developed. SPIRIT-LOG can only create patterns of contiguous predicates, i.e., no gaps are allowed in the sequential patterns. Furthermore, it is not possible to use background knowledge in the form of clauses (cf. (Lee and De Raedt, 2004)).

Lee and De Raedt introduce the logical language SeqLog for the representation of sequential logical data (Lee, 2006; Lee and De Raedt, 2004). The sequence itself is represented as a sequence of logical atoms. Additionally, it is possible to specify background knowledge as DATALOG style clauses. The mined patterns consist of a sequence of logical atoms. Two adjacent atoms in the pattern can be specified as direct (temporal) neighbors or allow for having other elements between them (denoted by the < symbol).
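The two kinds of adjacency in such sequential patterns (direct neighbors versus a gap marked by <) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tuple encoding of atoms, the convention that terms starting with an uppercase letter are variables, the hypothetical helpers `unify` and `matches`, and the example command log. It is not the SeqLog or MineSeqLog implementation.

```python
from typing import Dict, List, Optional, Tuple

Atom = Tuple[str, ...]        # (predicate, arg1, arg2, ...), an assumed encoding
Bindings = Dict[str, str]     # variable name -> bound constant


def is_var(term: str) -> bool:
    # Assumed convention: terms starting with an uppercase letter are variables.
    return term[:1].isupper()


def unify(pattern: Atom, fact: Atom, bindings: Bindings) -> Optional[Bindings]:
    """Try to match one pattern atom against one ground atom."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    new = dict(bindings)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def matches(pattern: List[Tuple[Atom, bool]], seq: List[Atom]) -> Optional[Bindings]:
    """Return the first variable bindings under which the pattern occurs in seq.

    Each pattern element is (atom, gap_before). gap_before=True plays the role
    of the '<' symbol: other atoms may lie between this atom and the previous
    one. gap_before=False requires direct neighbors. The flag of the first
    element is ignored because the pattern may start anywhere.
    """
    def search(i: int, pos: int, bindings: Bindings) -> Optional[Bindings]:
        if i == len(pattern):
            return bindings
        atom, gap = pattern[i]
        if i == 0 or gap:
            candidates = range(pos, len(seq))
        else:
            candidates = range(pos, min(pos + 1, len(seq)))
        for j in candidates:
            b = unify(atom, seq[j], bindings)
            if b is not None:
                result = search(i + 1, j + 1, b)
                if result is not None:
                    return result
        return None

    return search(0, 0, {})


# Hypothetical command log in the spirit of the shell-log example.
log = [("cd", "src"), ("cp", "file1", "file2"), ("ls",), ("rm", "file1")]

with_gap = [(("cp", "X", "Y"), False), (("rm", "X"), True)]    # cp(X,Y) < rm(X)
adjacent = [(("cp", "X", "Y"), False), (("rm", "X"), False)]   # cp(X,Y) rm(X)

print(matches(with_gap, log))   # {'X': 'file1', 'Y': 'file2'}
print(matches(adjacent, log))   # None
```

The shared variable X is what ties the argument of `cp` to the argument of `rm`; the adjacent variant fails here only because `ls` lies between the two commands.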
Lee and De Raedt introduce the mining system MineSeqLog, which mines the borders of the solution space for an input sequence and a conjunction of monotonic and anti-monotonic constraints on the patterns (Lee and De Raedt, 2004). However, this mining approach cannot mine any patterns with concurrent occurrences of events or activities.

MINING TEMPORAL PATTERNS

The representation of scenes used here is based on Allen's theory of action and time (Allen, 1984); it is a set of time intervals representing different events and relations between objects in the scene, each with a temporal validity interval. An example of such a representation is shown in Fig. 1. The temporal dimension is shown on the x axis. The intervals (like "capacity_load(s1, high)") determine the validity of certain relations or activities. Formally, there exists a start and end time for each interval. As can be seen in Fig. 1, various intervals are "active" concurrently.

[Fig. 1, interval labels along the time axis: capacity_load(s1, high), capacity_load(s1, low), capacity_load(s1, high), break_down(c1), position(c2, a2), position(c2, a3)]
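To make the interval-based scene representation concrete, the following minimal sketch encodes a scene as a set of validity intervals and classifies the temporal relation between two of them. The predicate names are taken from the labels in Fig. 1, but the start and end times, the `Interval` class, and the helpers `allen_relation` and `active_at` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    predicate: str
    args: Tuple[str, ...]
    start: int   # hypothetical time stamps, with start < end
    end: int


def allen_relation(a: Interval, b: Interval) -> str:
    """Name the interval relation that holds between a and b (13 cases)."""
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met-by"
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    return "overlaps" if a.start < b.start else "overlapped-by"


def active_at(scene: List[Interval], t: int) -> List[Interval]:
    """Intervals valid at time t, treating each interval as [start, end)."""
    return [iv for iv in scene if iv.start <= t < iv.end]


# Scene with the interval labels of Fig. 1; all times are made up.
scene = [
    Interval("capacity_load", ("s1", "high"), 0, 4),
    Interval("capacity_load", ("s1", "low"), 4, 7),
    Interval("capacity_load", ("s1", "high"), 7, 10),
    Interval("break_down", ("c1",), 2, 5),
    Interval("position", ("c2", "a2"), 0, 3),
    Interval("position", ("c2", "a3"), 3, 10),
]

print(allen_relation(scene[0], scene[1]))            # meets
print(allen_relation(scene[0], scene[3]))            # overlaps
print([iv.predicate for iv in active_at(scene, 3)])
# ['capacity_load', 'break_down', 'position']
```

With these assumed times, three intervals are valid at t = 3, which is the kind of concurrency that purely sequential pattern representations cannot express.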

Publication date: 2008